🛠️ Agent Skill Development

In autonomous agent architectures, skills (or tools) mark the execution boundary of the system. LLMs excel at processing text, but they cannot natively query databases, write files, or call APIs. Agent skills are the interface that turns the model's plans into concrete actions in its workspace.

For implementation examples, see:

  • [OpenClaw Gateway Reference Guide](../OpenClaw Gateway.md)
  • [Hermes Agent Reference Guide](../Hermes Agent.md)

📐 1. Mechanics of Skill Binding & Schema Generation

When you equip an agent with a Python function, the framework translates the function definition into a JSON schema descriptor that the LLM reads during tool selection.

A. Docstrings as API Documentation

Models treat function names, argument type hints, and docstrings as their user manual. If a docstring is ambiguous, the model is more likely to pass incorrect or hallucinated parameters. A docstring that states the expected input precisely:

```python
def query_database(sql: str) -> str:
    """Executes a SELECT query on the PostgreSQL database.

    Args:
        sql: A valid, read-only SELECT SQL query string.
    """
    # Execution logic here...
    return "Result string"
```
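
To make the translation step concrete, here is a minimal sketch of how a framework might derive a schema descriptor from a function. The helper `build_tool_schema` and the `_JSON_TYPES` mapping are hypothetical names introduced for illustration. Real frameworks differ in the exact descriptor layout and usually also parse per-argument descriptions out of the docstring.

```python
import inspect
import json
from typing import get_type_hints

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def build_tool_schema(func) -> dict:
    """Derive a simplified JSON-schema-style descriptor from a function."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # the return type is not a parameter
    signature = inspect.signature(func)
    properties = {}
    required = []
    for name, param in signature.parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def query_database(sql: str) -> str:
    """Executes a SELECT query on the PostgreSQL database.

    Args:
        sql: A valid, read-only SELECT SQL query string.
    """
    return "Result string"


schema = build_tool_schema(query_database)
print(json.dumps(schema, indent=2))
```

The descriptor carries the function name, the full docstring, and the typed parameter list, which is everything the model sees when deciding whether and how to call the tool.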

B. Argument Validation Layers

To prevent LLM parameter hallucinations from corrupting execution pipelines, tools must route incoming arguments through strict validation layers (typically Pydantic v2 schemas) before running the underlying business logic. See Structured Outputs & Type Safety (M05) for data schema enforcement strategies.
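
As a rough illustration of such a validation layer, the sketch below checks model-supplied arguments before any business logic runs. To stay dependency-free it swaps in a standard-library dataclass with `__post_init__` checks instead of the Pydantic v2 schema the text recommends. The names `QueryArgs` and `run_query_tool` and the limit bounds are assumptions made for this example.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryArgs:
    """Validated arguments for a hypothetical read-only query tool."""
    sql: str
    limit: int = 100

    def __post_init__(self):
        if not isinstance(self.sql, str) or not self.sql.strip():
            raise ValueError("sql must be a non-empty string")
        if not self.sql.lstrip().lower().startswith("select"):
            raise ValueError("only SELECT statements are allowed")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(self.limit, bool) or not isinstance(self.limit, int):
            raise ValueError("limit must be an integer")
        if not 1 <= self.limit <= 1000:
            raise ValueError("limit must be between 1 and 1000")


def run_query_tool(raw_args: dict) -> str:
    """Validate model-supplied arguments, then run the (stubbed) tool body."""
    try:
        args = QueryArgs(**raw_args)
    except (TypeError, ValueError) as exc:
        # TypeError covers unknown or missing argument names.
        return json.dumps({"ok": False, "error": str(exc)})
    # Placeholder for the real execution logic.
    return json.dumps({"ok": True, "sql": args.sql, "limit": args.limit})


print(run_query_tool({"sql": "SELECT 1"}))
print(run_query_tool({"sql": "DROP TABLE users"}))
```

The `select` prefix check only demonstrates where a rule would sit. It is not a security boundary on its own, and read-only access is better enforced with database permissions.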


⚖️ 2. Tool Granularity: Micro-Tools vs. Macro-Tools

Choosing the appropriate abstraction boundary for your tools directly impacts latency, execution cost, and agent success rates.

  • Micro-Tools (e.g., write_line_to_file, change_character_at_index):
    • Pros: Each call carries a small schema and payload, conserving context per step; extremely easy to unit test.
    • Cons: Requires the agent to run many steps to complete simple tasks (e.g., writing a 50-line file takes 50 loop cycles), increasing token consumption and execution costs.
  • Macro-Tools (e.g., refactor_module, compile_and_test_workspace):
    • Pros: Fast execution; completes complex, multi-file operations in a single step.
    • Cons: Prone to parameter hallucinations; harder to write clean assertion tests; carries higher security risks.
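
The step-count trade-off can be illustrated with a small sketch. `append_line` stands in for a micro-tool and `write_file` for a coarser-grained tool. Both names are hypothetical, and the loop simply counts how many tool calls (agent cycles) each approach needs to produce the same 50-line file.

```python
import tempfile
from pathlib import Path


def append_line(path: str, line: str) -> str:
    """Micro-tool: appends a single line per call."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return "ok"


def write_file(path: str, content: str) -> str:
    """Coarser tool: writes the whole file in one call."""
    Path(path).write_text(content, encoding="utf-8")
    return "ok"


lines = [f"line {i}" for i in range(50)]

with tempfile.TemporaryDirectory() as tmp:
    micro_path = str(Path(tmp) / "micro.txt")
    macro_path = str(Path(tmp) / "macro.txt")

    micro_calls = 0
    for line in lines:  # one agent loop cycle per line
        append_line(micro_path, line)
        micro_calls += 1

    write_file(macro_path, "\n".join(lines) + "\n")
    macro_calls = 1

    same_content = (
        Path(micro_path).read_text(encoding="utf-8")
        == Path(macro_path).read_text(encoding="utf-8")
    )

print(micro_calls, macro_calls, same_content)
```

Both paths produce identical files, but the micro-tool route costs 50 round trips through the model where the coarser tool costs one.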

⚙️ 3. Execution Context & Parameter Injection

In production environments, tools often need access to system metadata (such as the current user's session token, database connection pool, or trace IDs) that should never be exposed to the LLM's context window.

  • System Parameters (Injected Arguments): The framework intercepts each tool call and injects the active session variables at runtime. These arguments never appear in the schema shown to the model, so the model can neither hallucinate nor hijack credentials.
  • State-Aware Tools: Tools that can modify the agent's active execution state, for example by updating local memory caches or writing progress checkpoints.
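
One way to picture injected arguments is a wrapper that binds runtime-only values and removes them from the signature a schema generator would inspect. This is a minimal sketch, not any particular framework's mechanism. `bind_context`, `fetch_orders`, and the context keys are hypothetical names.

```python
import functools
import inspect


def bind_context(func, context: dict):
    """Bind runtime-only values and hide them from the tool's visible signature."""
    signature = inspect.signature(func)
    # Only inject keys the function actually declares.
    injected = {k: v for k, v in context.items() if k in signature.parameters}
    visible = [p for name, p in signature.parameters.items() if name not in injected]

    @functools.wraps(func)
    def wrapper(**model_args):
        overlap = sorted(set(model_args) & set(injected))
        if overlap:
            # The model tried to supply a value reserved for the runtime.
            raise ValueError(f"model may not set injected arguments: {overlap}")
        return func(**model_args, **injected)

    # Schema generators that rely on inspect.signature now see only model-facing args.
    wrapper.__signature__ = signature.replace(parameters=visible)
    return wrapper


def fetch_orders(customer_id: str, session_token: str, trace_id: str) -> str:
    # session_token would authenticate the downstream call in a real tool.
    return f"orders for {customer_id} (trace {trace_id})"


tool = bind_context(
    fetch_orders,
    {"session_token": "tok-123", "trace_id": "trace-abc"},
)

print(list(inspect.signature(tool).parameters))
print(tool(customer_id="c-1"))
```

Binding per request rather than at import time keeps each session's token scoped to that session's tool instances.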

🛡️ 4. Sandbox Isolation & Security Envelopes

Because macro-tools can execute commands directly on the host machine, a successful prompt injection attack can have catastrophic consequences.

  • Privilege Restriction: Run agent processes under non-root, unprivileged system accounts.
  • Argument Sanitization: Escape every string passed to shell or SQL command builders. Prefer parameterized queries and argument lists over raw string concatenation to block injection vectors.
  • MicroVMs & Containers: Wrap tool runtimes in sandboxed environments. For design patterns, see MicroVM Sandboxing & Agent Security (M09).
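
The argument sanitization point can be sketched with the standard library alone. The example uses an in-memory `sqlite3` database as a stand-in for the PostgreSQL setup mentioned earlier, and the `users` table, `find_user`, and `echo_argument` are hypothetical. The same placeholder idea applies to other database drivers, although the placeholder syntax varies.

```python
import sqlite3
import subprocess
import sys


def find_user(conn: sqlite3.Connection, username: str) -> list:
    # The ? placeholder sends the value separately from the SQL text,
    # so quotes in the input cannot change the query structure.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cursor.fetchall()


def echo_argument(untrusted: str) -> str:
    # An argument list with the default shell=False bypasses shell parsing,
    # so metacharacters such as ';' are passed through as literal text.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

print(find_user(conn, "alice"))
print(find_user(conn, "alice' OR '1'='1"))
print(echo_argument("hello; echo injected"))
```

The classic injection string matches no rows because it is compared as a literal name, and the shell metacharacters come back unchanged because no shell ever interprets them.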

💡 5. Practical Best Practices

  1. Strict Return Values: Tools should always return structured text (JSON or plain markdown strings) rather than binary objects, so the agent can read and evaluate the outcome of the action.
  2. Explicit Error Logging: If a function fails (e.g., a database connection times out), catch the error and return the stack trace to the agent as text. This allows the model to analyze the traceback and self-correct.
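
Both practices can be combined in a small decorator, sketched below under the assumption that tool results are JSON-serializable. `report_errors`, `connect_to_database`, and the host name are hypothetical.

```python
import functools
import json
import traceback


def report_errors(func):
    """Wrap a tool so it always returns JSON text, even when it fails."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return json.dumps({"ok": True, "result": func(*args, **kwargs)})
        except Exception as exc:
            # Return the failure as text so the agent can read it and retry.
            return json.dumps({
                "ok": False,
                "error_type": type(exc).__name__,
                "message": str(exc),
                "traceback": traceback.format_exc(),
            })
    return wrapper


@report_errors
def connect_to_database(host: str) -> str:
    # Simulated failure standing in for a real connection timeout.
    raise TimeoutError(f"connection to {host} timed out")


payload = json.loads(connect_to_database("db.internal"))
print(payload["ok"], payload["error_type"], payload["message"])
```

Because the wrapper never lets an exception escape, the agent loop receives a readable result on every call and can decide whether to retry, change parameters, or give up.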